A question of interest to many is just how people make decisions.
This involves issues such as how people select data, organize it, process
it and what decision models they use. Studies have been made of
topics such as the degree of self-insight people have into their decision
process, the extent of functional fixation and the presence and nature
of processing biases.

Two basic models have been explored in the literature. The one
derives from Bayes' Theorem. The other relies on the Brunswick Lens
model. Both are clearly sub-models of the more general decision theory
approach.[2] The two have recently been contrasted by Wright (1977). This
paper leans heavily on his work and seeks to extend and deepen his
analysis.

Much of the empirical work involving these two models has relied
on laboratory studies using students as subjects. The extent to which
results derived in this manner can be generalized has been questioned
in a general methodological sense by several writers. The analysis in
this paper suggests that perhaps many of the empirical conclusions
follow from subjects who have uniform priors and make utility free
judgements. If so, the conclusions indeed might not be generalizable.

The  Brunswick  Lens  Model 

Brunswick proposed his Lens Model as a method to relate environmental
and individual-specific variables in the decision process. Typically
the subject is provided with N realizations of K stimuli or,
equivalently, an N x K cue matrix $X$.[4] Actual realizations of an
observable variable $y_e$ are then compared with subject i's estimates
$y_i$, conditional on $X$. These are two N x 1 vectors. Inter alia, the
two sets of dependent variables are correlated, creating the so-called
achievement
In general, we have $y_e = f_e(X)$ and $y_i = f_i(X)$. However, it is
usually assumed that these functions are linear so that

$$y_e = X\beta_e; \qquad y_e : N \times 1, \quad X : N \times K, \quad \beta_e : K \times 1$$
$$y_i = X\beta_i; \qquad y_i : N \times 1, \quad \beta_i : K \times 1$$

The vectors of beta weights are assumed to indicate the relative
importance of each of the stimuli for the subject and in the environment.

There is a question here whether the linear models so derived are
in some sense actually used by the subject or whether they more properly
constitute an ex post fit only. The question is probably irrelevant
since what we are after is an ability to predict a subject's response
and not, even were it possible, to trace the exact synaptic chain in the
neurological system.

It is of interest to note, however, that the linearity assumption
appears to work remarkably well in its description of human judgement
in these empirical studies. Wright (1977), for example, reports
R-squared values as high as 95% for linear regressions of the subjective
judgements against the stimuli. Clearly this is in part due to the
actual environmental response being a linear function of the stimuli or
cues. However, even where the actual function appears to be non-linear,
subjects still appear to rely heavily on linear judgemental models. Why
this should be so is patently a matter of interest.

Most of the studies involving the Lens model have, however, taken
place in laboratory conditions. This is almost inevitable given the
nature of the research. This does though raise the very real question
as to whether these results can be generalized. Stated slightly
differently, what implicit assumptions might be present in these laboratory
studies that might not be present in the real world? More generally,
what is the relationship of this approach to the more general decision
theory approach?

Bayesian  Probability: 

We are on firmer theoretical ground here, albeit weaker empirical
ground, than with the Lens Model. For good mathematical sense we
require that

$$\Pr(y_n \mid x_k) = \frac{\Pr(y_n)\,\Pr(x_k \mid y_n)}{\sum_{j} \Pr(y_j)\,\Pr(x_k \mid y_j)} \qquad (3)$$

Or, in words, the probability of a given outcome conditioned on a set of
stimuli $x_k$ is proportional to the unconditional probability of that
outcome multiplied by the probability of the set of stimuli conditioned on
that outcome.

The  first  of  these  probabilities,  the  unconditional  probability, 
is  most  commonly  known  as  the  prior  probability  of  the  outcome.   This  is 
the  probability  assessment,  in  our  context  subjective,  of  course,  of  the 
occurrence  of  that  particular  outcome.   The  second  of  these  probabilities 
is  known  as  the  likelihood  function.   It  is  the  likelihood  of  that  set 
of  stimuli  given  the  occurrence  of  that  particular  outcome. 
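
A minimal numerical sketch of equation (3) may help fix ideas. The three outcomes, their priors and their likelihoods below are hypothetical numbers chosen only to show the arithmetic, and the function name `posterior` is likewise an assumption of this example.

```python
def posterior(priors, likelihoods):
    """Return Pr(y_j | x_k) for every outcome j, following Bayes' theorem.

    priors[j]      : Pr(y_j), the prior probability of outcome j
    likelihoods[j] : Pr(x_k | y_j), the likelihood of the observed stimuli
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)   # the denominator of equation (3)
    return [j / evidence for j in joint]

# Hypothetical priors and likelihoods for three outcomes.
priors = [0.5, 0.3, 0.2]
likelihoods = [0.1, 0.4, 0.7]

post = posterior(priors, likelihoods)
```

Here the outcome with the lowest prior ends up with the highest posterior because the stimuli are far more likely under it, which is the kind of revision the rule demands.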

Typically subjects are asked first to provide their priors on a
set of outcomes and then to provide their posterior probabilities for
the same set of outcomes based on the new information fed to them.
Wright's survey reports that the results are mixed. For one, people do
not appear to revise their priors sufficiently, i.e., their revisions
are conservative when compared with the Bayesian rule.

Decision Theory:

The two paradigms as they stand are not really comparable. In
essence, the one requires the subject to arrive at a best estimate of an
outcome $y$ given a data vector $x$. The other essentially requires the
subject to arrive at an estimate of the probability of a given outcome
$y_n$ conditioned on the same data vector $x_k$. Neither makes explicit
mention of a loss or utility function.

A more complete analysis would involve a combination of probabilities
and utility functions. In other words, decision theory would predict
that in general people would select a given outcome based on the expected
utility of that choice. By way of example, consider the following
illustration drawn from Wonnacott and Wonnacott (1972). We are asked to
estimate the length of a beetle given that our priors about the species
are a mean length of 25 mm with a variance of 4 mm^2, the distribution
being assumed normal. Suppose now that a sample of 10 beetles yields an
average of 20 mm and a variance of 10 mm^2. A classical estimate of the
95% confidence interval would be

$$\theta = \bar{X} \pm 1.96\,\frac{\sigma}{\sqrt{n}} = 20 \pm 1.96$$

However, it can be shown, and they do, that the posterior distribution
is normal with

$$p(\theta \mid \bar{X}) = N\!\left(\frac{W_1 \bar{X} + W_2 \theta_0}{W_1 + W_2},\; \frac{1}{W_1 + W_2}\right)$$

where the zero subscript indicates a prior and

$$W_1 = \frac{1}{\sigma^2 / n}, \qquad W_2 = \frac{1}{\sigma_0^2}$$

It follows that the Bayesian mean is

$$E(\theta \mid \bar{X}) = \frac{1(20) + \tfrac{1}{4}(25)}{1 + \tfrac{1}{4}} = 21$$

with a variance of

$$\frac{1}{1 + \tfrac{1}{4}} = 0.8$$

To  these  results  we  now  need  to  add  a  loss  function.   It  is  well  known 
that  a  quadratic  loss  function  implies  a  mean  as  an  optimal  estimate. 
The  best  estimate  then,  given  a  revision  of  priors  and  a  quadratic  loss 
function,  is  21  as  opposed  to  the  earlier  point  estimate  of  20. 
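
The arithmetic of the beetle illustration can be checked with a short sketch. It assumes, as the example itself does, that the sample variance of 10 is treated as the known variance; the function name `normal_posterior` and its parameter names are introduced here for illustration only.

```python
import math

def normal_posterior(prior_mean, prior_var, sample_mean, sample_var, n):
    """Posterior mean and variance for a normal prior and a normal sample.

    w1 is the precision of the sample mean, w2 the precision of the prior.
    """
    w1 = 1.0 / (sample_var / n)
    w2 = 1.0 / prior_var
    post_mean = (w1 * sample_mean + w2 * prior_mean) / (w1 + w2)
    post_var = 1.0 / (w1 + w2)
    return post_mean, post_var

# Figures from the beetle illustration: prior N(25, 4), sample of 10
# with mean 20 and variance 10.
post_mean, post_var = normal_posterior(25.0, 4.0, 20.0, 10.0, 10)

# Classical 95% interval around the sample mean, for comparison.
half_width = 1.96 * math.sqrt(10.0 / 10)
classical_interval = (20.0 - half_width, 20.0 + half_width)
```

The posterior mean of 21 lies between the sample mean and the prior mean, pulled toward the sample because the sample carries four times the precision of the prior.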

What is also apparent from this analysis is the fact that as one's
priors become more diffuse so the second weight $W_2$ goes to zero. The
posterior then reduces to the classical formulation

$$p(\theta \mid \bar{X}) = N(\bar{X},\; \sigma^2 / n) \qquad (7)$$

Then, given a quadratic loss function, the best estimate is the sample
mean of 20. To arrive at this requires, however, the double assumption
of a suitable loss function and uniform or diffuse priors.

The Bayesian paradigm does not call for an estimate, merely the
posterior distribution of all possible estimates. As such, it does not
involve the use of loss functions, as indeed it should not, if the
subject is Savage rational.

The lens paradigm, on the other hand, does call for an estimate
although it, in turn, makes no assumptions about either a loss function
or probability distributions, whether prior or posterior. However, it
is well known that OLS involves the minimization of the square of the
errors. This implies a use of quadratic loss functions, or at least this
is one way that one can interpret the resulting linear function.
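
Two of the points made above lend themselves to a quick numerical check: that the posterior mean approaches the sample mean as the prior becomes diffuse, and that the choice of loss function determines which summary of a distribution is the best estimate (the mean under quadratic loss, the median under absolute loss, as footnote 5 remarks). The sketch below reuses the beetle figures for the first point; the five draws used for the second are hypothetical, and the grid search is simply a convenient way to locate the minimum.

```python
def posterior_mean(prior_mean, prior_var, sample_mean, sample_var, n):
    """Precision-weighted posterior mean for a normal prior and sample."""
    w1 = n / sample_var
    w2 = 1.0 / prior_var
    return (w1 * sample_mean + w2 * prior_mean) / (w1 + w2)

# Increasingly diffuse priors: the prior variance grows, so W_2 shrinks.
means = [posterior_mean(25.0, v, 20.0, 10.0, 10) for v in (4.0, 100.0, 1e6)]

def quadratic_loss(estimate, values):
    """Average squared error of a point estimate."""
    return sum((v - estimate) ** 2 for v in values) / len(values)

def absolute_loss(estimate, values):
    """Average absolute error of a point estimate."""
    return sum(abs(v - estimate) for v in values) / len(values)

# Hypothetical skewed draws: mean 21, median 20.
values = [18.0, 19.0, 20.0, 21.0, 27.0]
candidates = [c / 10 for c in range(150, 301)]   # 15.0, 15.1, ..., 30.0

best_quadratic = min(candidates, key=lambda c: quadratic_loss(c, values))
best_absolute = min(candidates, key=lambda c: absolute_loss(c, values))
```

With these draws the quadratic-loss minimizer is the mean and the absolute-loss minimizer is the median, so the two criteria disagree whenever the distribution is skewed.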


Maximum  Likelihood  Estimates: 

There  is  an  alternative  approach  to  all  of  this  involving  maximum 
likelihood  estimators  which  might  shed  light  on  the  empirical  results 
obtained  to  date  in  the  Lens  studies.   It  also  enables  one  to  link  the 
Bayesian  approach  more  closely  to  the  Lens  approach. 

Consider first an individual who is asked to come up with a best
judgement of some parameter or dependent variable $\theta$. Assume further
that he or she is provided with a data vector $X$. Finally, assume, most
importantly, that the subject makes a utility free judgement. This is
essentially what subjects are required to do in the Bayesian paradigm.
Alternatively stated, they are basically asked to choose that $\theta$
which maximizes

$$\Pr(\theta \mid X, H) = \frac{\Pr(\theta \mid H)\,\Pr(X \mid \theta, H)}{\Pr(X \mid H)} \qquad (8)$$

where H is the prior information available to the subject. But, since
the denominator of (8) is a constant, this reduces to

$$\max_{\theta}\; \Pr(\theta \mid H)\,\Pr(X \mid \theta, H)$$

But when the prior $\Pr(\theta \mid H)$ is vague, and so roughly constant
in $\theta$, this is precisely the maximum likelihood estimator.
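
The role of the prior in this maximization can be seen in a small grid search. The sketch below borrows the beetle figures (sample mean 20 with variance 1 for the mean, prior N(25, 4)) purely as a convenient test case; the grid, the function names and the flat prior are assumptions of the example.

```python
import math

def log_normal_pdf(x, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def log_likelihood(theta):
    """Log Pr(X | theta): sample mean of 20 with variance sigma^2 / n = 1."""
    return log_normal_pdf(20.0, theta, 1.0)

def flat_log_prior(theta):
    """A vague prior, constant over the grid."""
    return 0.0

def informative_log_prior(theta):
    """Log Pr(theta | H) for the prior N(25, 4)."""
    return log_normal_pdf(theta, 25.0, 4.0)

def argmax_posterior(grid, log_prior):
    """Grid value of theta maximizing Pr(theta | H) Pr(X | theta, H)."""
    return max(grid, key=lambda t: log_prior(t) + log_likelihood(t))

grid = [t / 100 for t in range(1500, 3001)]   # 15.00, 15.01, ..., 30.00

mle = max(grid, key=log_likelihood)
map_flat = argmax_posterior(grid, flat_log_prior)
map_informative = argmax_posterior(grid, informative_log_prior)
```

Under the flat prior the maximizer coincides with the maximum likelihood value of 20, while the informative prior moves it to 21.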

Thus, the first result that we arrive at is that if the subject's
priors are vague, the Bayesian decision rule collapses into a Maximum
Likelihood Estimate. Whether people actually do use MLE's is, of course,
a matter for empirical investigation. However, it is highly probable that
most students, who form the bulk of the laboratory population, do indeed
have very vague priors about the items they are asked to estimate.
Moreover, in the absence of any real incentive or punishment, it is not at all
unlikely that they generate fairly utility free judgements. Given these
two probabilities, it would not be surprising to find them seeking for
an estimate or judgement which is in some sense "most likely", that is,
the value most consistent with the data. In other words, it would not be
entirely improbable to find that laboratory subjects were using Maximum
Likelihood estimates.

This surmise is further strengthened if we make the further
assumption that the data vector $X$ is normally distributed, or at least,
approximately normal in its distribution. It is well known that in this
case the ML estimate is identical to the OLS estimate and that both are
linear. This might be true for the data itself or for the subject's
opinion of the data or both. All that we require is for the subjective
perception of the data to be reasonably normal to arrive at a linear
judgemental rule.
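
The equivalence of the ML and OLS estimates under normality can be illustrated for the simplest case of a single cue and no intercept. The data below are hypothetical, the error variance is fixed at 1 for convenience (its value does not affect where the maximum falls), and the grid search stands in for an analytical maximization.

```python
import math

# Hypothetical single-cue data, roughly y = 2x plus noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# OLS weight for y = b * x: minimizes the sum of squared errors.
b_ols = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def gaussian_log_likelihood(b, sigma2=1.0):
    """Log likelihood of the data when errors are N(0, sigma2)."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma2) - (yi - b * xi) ** 2 / (2 * sigma2)
        for xi, yi in zip(x, y)
    )

# Maximize the likelihood over a grid of candidate weights (step 0.001).
grid = [k / 1000 for k in range(1500, 2501)]
b_ml = max(grid, key=gaussian_log_likelihood)
```

The two estimates agree to within the grid spacing, since maximizing the normal log likelihood amounts to minimizing the same sum of squared errors.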

Conclusion:

What we have arrived at then is the surmise that if certain
assumptions hold, a linear judgemental model is wholly to be expected. These
assumptions are 1) that a utility free judgement be made, 2) that the
subject have fairly uniform priors, 3) that the data be perceived to be
normally distributed and 4) that a maximum likelihood estimator is being
used.

However, in the real world of harsh risks and pleasant rewards, it
is highly unlikely that any of these assumptions will be met.
Judgements are rarely utility free. Subjects have considerable priors in
their areas of competence. Some would even label them fixations. The
data might or might not be normally distributed. However, even if it
were, in the absence of quadratic loss functions, a linear decision rule
would be improbable.

The question then arises: can we trace at a slightly deeper level
what might be going on? Can we design experiments to check whether the
above assumptions are indeed being met in the laboratory? Can we perhaps
enrich the environments by more information and/or rewards and punishments
to test what happens when they are not met? And can we observe, more
closely, some actual decisions to test whether the linear decision rule
is merely a laboratory animal or more widely used? And, if so, why?


Footnotes 

1) A full survey of the literature together with a bibliography may be
found in Wright (1977).

2) A good discussion of decision theory especially as it relates to
accounting may be found in Demski (1972).

3) See Brunswick (1952, 1956). Also Castellan (1972, 1973), Dudycha and
Naylor (1966), Hammond, Hursch, and Todd (1964), Hursch, Hammond and
Hursch (1964), Stewart (1976) and Tucker (1964).

4) A letter with a bar underneath it indicates a vector or a matrix
depending on the context.

5) An absolute loss function suggests the median as an optimal estimate
while a (0, 1) loss function suggests a mode. Where the distribution
is symmetric, these will, of course, coincide with the mean.

6) For more details, see Savage (1954).

7) A "true" Bayesian will not admit of the existence of distribution
other than a subjective one so that this formulation would be
redundant.